Abstract In this paper, we establish convergence theorems for the Non-Local Means Filter in removing additive Gaussian noise. We employ the techniques of "Oracle" estimation to determine the order of the widths of the similarity patches and search windows in this filter. We propose a practical choice of these parameters which improves the restoration quality of the filter compared with the usual choice of parameters.

Keywords Non-Local Means • Gaussian noise • 
"Oracle" estimator • Mean Squared Error • weighted 
means 

1 Introduction 



We deal with the additive Gaussian noise model:

$$Y(x) = f(x) + \varepsilon(x), \quad x \in \mathbf{I}, \tag{1}$$

where $\mathbf{I}$ is the uniform $N \times N$ grid of pixels on the unit square, $Y = (Y(x))_{x \in \mathbf{I}}$ is the observed image brightness, $f : [0,1]^2 \to \mathbb{R}_+$ is an original image (unknown target regression function) and $\varepsilon = (\varepsilon(x))_{x \in \mathbf{I}}$ are independent and identically distributed (i.i.d.) Gaussian random variables with mean $0$ and standard deviation $\sigma > 0$.
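The noise model (1) can be simulated in a few lines. The following is a minimal sketch, assuming NumPy; the ramp image, the grid size `N = 64` and the helper name `add_gaussian_noise` are illustrative choices, not part of the paper.

```python
import numpy as np

def add_gaussian_noise(f, sigma, seed=0):
    """Return Y = f + eps, with eps i.i.d. N(0, sigma^2) on the pixel grid."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(loc=0.0, scale=sigma, size=f.shape)
    return f + eps

# Hypothetical test image: a smooth ramp sampled on the uniform N x N grid.
N = 64
grid = np.arange(1, N + 1) / N        # the points 1/N, 2/N, ..., 1
f = 255.0 * np.outer(grid, grid)      # a non-negative function on [0, 1]^2
Y = add_gaussian_noise(f, sigma=20.0)
```

The empirical standard deviation of `Y - f` should be close to the chosen `sigma`, which gives a quick sanity check of the simulated data.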

Important denoising techniques for the model (1) have been developed in recent years. A very significant step in these developments was the introduction of the Non-Local Means Filter by Buades et al. [1]. For closely related works, see for example [2-6].

The basic idea of the filters by weighted means is to estimate the unknown image $f(x_0)$ by a weighted average of observations $Y(x)$ of the form

$$\hat f_w(x_0) = \sum_{x \in \mathbf{U}_{x_0,h}} w(x) Y(x), \tag{2}$$

where for each $x_0$ and $h > 0$, $\mathbf{U}_{x_0,h}$ denotes a square window with center $x_0$ and width $2h$, and $w(x)$ are some non-negative weights satisfying $\sum_{x \in \mathbf{U}_{x_0,h}} w(x) = 1$. The choice of the weights $w(x)$ is usually based on two criteria: a spatial criterion so that $w(x)$ is a decreasing function of the distance between $x$ and $x_0$, and a similarity criterion so that $w(x)$ is also a decreasing function of the brightness difference $|Y(x) - Y(x_0)|$ (see e.g. [7, 8]), which measures the similarity between the pixels $x$ and $x_0$. In the Non-Local Means Filter, $h > 0$ can be chosen relatively large, and the weights $w(x)$ are calculated according to the similarity between the data patches $\mathbf{Y}_{x,\eta} = (Y(y) : y \in \mathbf{U}_{x,\eta})$ (identified as a vector whose components are ordered lexicographically) and $\mathbf{Y}_{x_0,\eta} = (Y(y) : y \in \mathbf{U}_{x_0,\eta})$, instead of the similarity between just the pixels $x$ and $x_0$. Here $\eta > 0$ is the size parameter of data patches.






Qiyu Jin et al. 



The Non-Local Means Filter was further enhanced for speed in subsequent works by Mahmoudi and Sapiro (2005 [9]), Bilcu and Vehvilainen (2007 [10]), Karnati, Uliyar and Dey (2009 [11]), and Vignesh, Oh and Kuo (2010 [12]). Other authors, such as Kervrann and Boulanger (2006 [13], 2008 [3]), Chatterjee and Milanfar (2008 [14]), Buades, Coll and Morel (2006 [15]), and Dabov, Foi, Katkovnik and Egiazarian (2007 [16], 2009 [17]), further improved the Non-Local method. Thacker, Bromiley and Manjon (2008 [18]) investigate its statistical basis in order to understand the conditions required for the use of Non-Local means, testing the theory on simulated data and MR images of the normal brain. Katkovnik, Foi, Egiazarian and Astola (2010 [5]) review the evolution of non-parametric regression modeling in imaging from the local Nadaraya-Watson kernel estimate to the Non-Local means and further to transform-domain filtering based on Non-Local block-matching.

Unfortunately, the ideal implementation of Non-Local Means is computationally expensive. Therefore, to speed up denoising, only a neighborhood of the estimated point is considered. In practice, similarity patches of size 7 x 7 or 9 x 9 and search windows of size 19 x 19 or 21 x 21 are often chosen. However, these choices are empirical and the problem of the optimal choice remains open. As a consequence, the results of numerical simulations are not always satisfactory.

In this paper, we use statistical estimation and optimization techniques to give a justification of the Non-Local Means Filter, and to suggest the order of the sizes of the search window and the similarity patch. Our main idea is to minimize a tight upper bound of the $L^2$ risk

$$R(\hat f_w(x_0)) = \mathbb{E}\big(\hat f_w(x_0) - f(x_0)\big)^2$$

by changing the width of the search window. We first obtain an explicit formula for the optimal weights $w_h^*$ in terms of the unknown function $f$. The corresponding weighted mean is called the "Oracle"; the "Oracle" $f_h^*$ is shown to have an optimal rate of convergence and high performance in numerical simulations. To mimic the "Oracle", we estimate $w_h^*$ by some adaptive weights $\hat w_h$ based on the observed image $Y$. We thus obtain the Non-Local Means Filter with the proper width of the search window. Numerical results show that the Non-Local Means Filter with the proper window width outperforms the Non-Local Means Filter with the standard choice.

The paper is organized as follows. In Section 2, we introduce the "Oracle" estimator of the Non-Local Means Filter and reconstruct the Non-Local Means Filter with the idea of "Oracle" theory. Our main theoretical results are presented in Section 3, where we give the rate of convergence of the Non-Local Means Filter. In Section 4, we present our simulation results with a brief analysis. Section 5 gives the conclusion of our paper. Proofs of the main results are deferred to Section 6.



2 Main results 

2.1 Notations 

Let us set some notations to be used throughout the paper. The Euclidean norm of a vector $x = (x_1, \dots, x_d) \in \mathbb{R}^d$ is denoted by $\|x\|_2 = \big(\sum_{i=1}^d x_i^2\big)^{1/2}$. The supremum norm of $x$ is denoted by $\|x\|_\infty = \sup_{1 \le i \le d} |x_i|$. The cardinality of a set $A$ is denoted by $\mathrm{card}\, A$. For a positive integer $N$, the uniform $N \times N$ grid of pixels on the unit square is defined by

$$\mathbf{I} = \left\{ \frac{1}{N}, \frac{2}{N}, \dots, \frac{N-1}{N}, 1 \right\}^2. \tag{3}$$

Each element $x$ of the grid $\mathbf{I}$ will be called a pixel. The number of pixels is $n = N^2$. For any pixel $x_0 \in \mathbf{I}$ and a given $h > 0$, the square window of pixels

$$\mathbf{U}_{x_0,h} = \{ x \in \mathbf{I} : \|x - x_0\|_\infty \le h \} \tag{4}$$

will be called the search window at $x_0$. We naturally take $h$ as a multiple of $\frac{1}{N}$ ($h = \frac{k}{N}$ for some $k \in \{1, 2, \dots, N\}$). The size of the square search window $\mathbf{U}_{x_0,h}$ is the positive integer

$$M = (2Nh + 1)^2 = \mathrm{card}\, \mathbf{U}_{x_0,h}. \tag{5}$$

For any pixel $x \in \mathbf{U}_{x_0,h}$ and a given $\eta > 0$, a second square window of pixels $\mathbf{U}_{x,\eta}$ will be called the patch at $x$. Like $h$, the parameter $\eta$ is also taken as a multiple of $\frac{1}{N}$. The size of the patch $\mathbf{U}_{x,\eta}$ is the positive integer

$$m = (2N\eta + 1)^2 = \mathrm{card}\, \mathbf{U}_{x_0,\eta}. \tag{6}$$

The vector $\mathbf{Y}_{x,\eta} = (Y(y))_{y \in \mathbf{U}_{x,\eta}}$ formed by the values of the observed noisy image $Y$ at pixels in the patch $\mathbf{U}_{x,\eta}$ will be called simply the data patch at $x \in \mathbf{U}_{x_0,h}$.
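The window sizes in (5) and (6) are easy to check numerically. The sketch below, assuming NumPy, works in integer pixel indices, so that `half_width` stands for $Nh$ (or $N\eta$); the function name `window_indices` and the values `k_search = 6`, `k_patch = 10` are hypothetical and chosen only to reproduce a 13 x 13 search window and a 21 x 21 patch.

```python
import numpy as np

def window_indices(center, half_width, N):
    """Integer (row, col) indices of the pixels x with ||x - x0||_inf <= h,
    where half_width = N * h; the window is clipped to the N x N grid."""
    r0, c0 = center
    rows = np.arange(max(r0 - half_width, 0), min(r0 + half_width, N - 1) + 1)
    cols = np.arange(max(c0 - half_width, 0), min(c0 + half_width, N - 1) + 1)
    return np.array([(r, c) for r in rows for c in cols])

N = 512
k_search, k_patch = 6, 10        # h = 6/N and eta = 10/N (illustrative values)
M = (2 * k_search + 1) ** 2      # search window size, equation (5)
m = (2 * k_patch + 1) ** 2       # patch size, equation (6)
interior = window_indices((100, 100), k_search, N)
```

For an interior pixel the number of returned indices equals `M`; near the border the clipped window is smaller, which is one reason the image is extended by mirroring in practice.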

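2.2 The "Oracle" of Non-Local means

In order to study the statistical estimation theory of the Non-Local Means algorithm, we introduce an "Oracle" estimator (for details on this concept see Donoho and Johnstone (1994 [19])) of Non-Local means. Denote

$$f_h^*(x_0) = \sum_{x \in \mathbf{U}_{x_0,h}} w_h^*(x) Y(x), \tag{7}$$

where

$$w_h^*(x) = e^{-\frac{\rho_{f,x_0}^2(x)}{H^2}} \Big/ \sum_{x' \in \mathbf{U}_{x_0,h}} e^{-\frac{\rho_{f,x_0}^2(x')}{H^2}}, \quad x \in \mathbf{U}_{x_0,h}, \tag{8}$$

$$\rho_{f,x_0}(x) = |f(x) - f(x_0)| \tag{9}$$

and $H > 0$ is a constant. It is obvious that

$$\sum_{x \in \mathbf{U}_{x_0,h}} w_h^*(x) = 1 \quad \text{and} \quad w_h^*(x) \ge 0. \tag{10}$$

Note that the function $\rho_{f,x_0}(x) \ge 0$ characterizes the similarity of the image brightness at the pixel $x$ with respect to the pixel $x_0$; therefore we shall call $\rho_{f,x_0}$ the similarity function. The usual bias-variance decomposition of the Mean Squared Error (MSE) gives

$$\mathbb{E}\big(f(x_0) - f_h^*(x_0)\big)^2 = \Big( \sum_{x \in \mathbf{U}_{x_0,h}} w_h^*(x)\big(f(x) - f(x_0)\big) \Big)^2 + \sigma^2 \sum_{x \in \mathbf{U}_{x_0,h}} w_h^*(x)^2 \le \Big( \sum_{x \in \mathbf{U}_{x_0,h}} w_h^*(x)\,|f(x) - f(x_0)| \Big)^2 + \sigma^2 \sum_{x \in \mathbf{U}_{x_0,h}} w_h^*(x)^2. \tag{11}$$

The inequality (11) combined with (9) implies the following upper bound:

$$\mathbb{E}\big(f(x_0) - f_h^*(x_0)\big)^2 \le g(w_h^*), \tag{12}$$

where

$$g(w) = \Big( \sum_{x \in \mathbf{U}_{x_0,h}} w(x) \rho_{f,x_0}(x) \Big)^2 + \sigma^2 \sum_{x \in \mathbf{U}_{x_0,h}} w(x)^2. \tag{13}$$

We shall define a family of estimates by minimizing the function $g(w_h^*)$ and plugging the optimal weights into (7). We shall consider the local Hölder condition

$$|f(x) - f(y)| \le L \|x - y\|_\infty^\beta, \quad \forall x, y \in \mathbf{U}_{x_0,h+\eta}, \tag{14}$$

where $\beta > 0$ and $L > 0$ are constants, $h > 0$, $\eta > 0$ and $x_0 \in \mathbf{I}$. The following theorem gives the rate of convergence of the "Oracle" estimator and the proper width $h$ of the search window.

Theorem 1 Assume that $h = \left(\frac{\sigma^2}{4\beta L^2}\right)^{\frac{1}{2\beta+2}} n^{-\frac{1}{2\beta+2}}$ and $H > \sqrt{2} L h^\beta$. Suppose that the function $f$ satisfies the local Hölder condition (14) and $f_h^*(x_0)$ is given by (7). Then

$$\mathbb{E}\big(f_h^*(x_0) - f(x_0)\big)^2 \le \frac{2^{\frac{2\beta+6}{2\beta+2}} \sigma^{\frac{4\beta}{2\beta+2}} L^{\frac{4}{2\beta+2}}}{\beta^{\frac{2\beta}{2\beta+2}}}\, n^{-\frac{2\beta}{2\beta+2}}. \tag{15}$$

For the proof of this theorem see Section 6.1.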

Simulations confirm that the difference between the "Oracle" $f_h^*(x_0)$ and the true value $f(x_0)$ is extremely small (see Table 1; the definition of PSNR can be found in Section 4). Thus, at least from the practical point of view, it is reasonable to optimize the upper bound $g(w_h^*)$ instead of optimizing the risk $\mathbb{E}\big(f_h^*(x_0) - f(x_0)\big)^2$ itself.

Theorem 1 shows that the choice of a small search window, in place of the whole observed image, suffices to ensure denoising without loss of visual quality, and explains why we take a small search window for the simulations in the Non-Local Means algorithm.
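To make the statement of Theorem 1 concrete, the following sketch evaluates the window half-width $h$ of the theorem, the upper bound $4L^2h^{2\beta} + \sigma^2/(h^2 n)$ that the proof in Section 6.1 minimizes, and the "Oracle" weights (8). It assumes NumPy; the function names and the numerical values ($\sigma = 20$, $L = 1$, $\beta = 1$, $n = 512^2$) are illustrative assumptions, and $h$ is expressed in unit-square coordinates, not in pixels.

```python
import numpy as np

def optimal_h(sigma, L, beta, n):
    """Half-width h = (sigma^2 / (4 beta L^2))^(1/(2 beta + 2)) * n^(-1/(2 beta + 2))."""
    exponent = 1.0 / (2.0 * beta + 2.0)
    return (sigma**2 / (4.0 * beta * L**2)) ** exponent * n ** (-exponent)

def risk_bound(h, sigma, L, beta, n):
    """Bias-variance upper bound 4 L^2 h^(2 beta) + sigma^2 / (h^2 n)."""
    return 4.0 * L**2 * h ** (2.0 * beta) + sigma**2 / (h**2 * n)

def oracle_weights(f_window, f_center, H):
    """Weights proportional to exp(-rho^2 / H^2), normalized to sum to one."""
    rho2 = (np.asarray(f_window, dtype=float) - f_center) ** 2
    w = np.exp(-rho2 / H**2)
    return w / w.sum()

n = 512**2
h_star = optimal_h(sigma=20.0, L=1.0, beta=1.0, n=n)
bound_at_h_star = risk_bound(h_star, sigma=20.0, L=1.0, beta=1.0, n=n)
w = oracle_weights([100.0, 102.0, 130.0], f_center=100.0, H=10.0)
```

Perturbing `h_star` in either direction increases `risk_bound`, which illustrates the trade-off between the bias term (growing with $h$) and the variance term (shrinking with $h$). Because the "Oracle" weights use the unknown $f$, they are only available in simulations.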



2.3 Reconstruction of Non-Local Means filter 

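With the theory of the "Oracle" estimator, we reconstruct the Non-Local Means Filter [1]. Let $h > 0$ and $\eta > 0$ be fixed numbers. For any $x_0 \in \mathbf{I}$ and any $x \in \mathbf{U}_{x_0,h}$, the distance between the data patches $\mathbf{Y}_{x,\eta} = (Y(y))_{y \in \mathbf{U}_{x,\eta}}$ and $\mathbf{Y}_{x_0,\eta} = (Y(y))_{y \in \mathbf{U}_{x_0,\eta}}$ is defined by

$$d^2(\mathbf{Y}_{x,\eta}, \mathbf{Y}_{x_0,\eta}) = \|\mathbf{Y}_{x,\eta} - \mathbf{Y}_{x_0,\eta}\|_2^2,$$

where

$$\|\mathbf{Y}_{x,\eta} - \mathbf{Y}_{x_0,\eta}\|_2^2 = \frac{1}{m} \sum_{y \in \mathbf{U}_{x_0,\eta}} \big(Y(T_x y) - Y(y)\big)^2,$$

$T_x$ is the translation mapping $T_x y = x + (y - x_0)$ and $m$ is given by (6). This distance measures the similarity between the data patches $\mathbf{Y}_{x,\eta}$ and $\mathbf{Y}_{x_0,\eta}$. Since $|f(x) - f(x_0)|^2 = \mathbb{E}|Y(x) - Y(x_0)|^2 - 2\sigma^2$, an obvious estimator of $\mathbb{E}|Y(x) - Y(x_0)|^2$ is given by $d^2(\mathbf{Y}_{x,\eta}, \mathbf{Y}_{x_0,\eta})$. Define an estimated similarity function $\hat\rho_{x_0}$ by

$$\hat\rho_{x_0}^2(x) = d^2(\mathbf{Y}_{x,\eta}, \mathbf{Y}_{x_0,\eta}) - 2\sigma^2, \tag{16}$$

and an adaptive estimator $\hat f_h$ by

$$\hat f_h(x_0) = \sum_{x \in \mathbf{U}_{x_0,h}} \hat w_h(x) Y(x), \tag{17}$$

where

$$\hat w_h(x) = e^{-\frac{\hat\rho_{x_0}^2(x)}{H^2}} \Big/ \sum_{x' \in \mathbf{U}_{x_0,h}} e^{-\frac{\hat\rho_{x_0}^2(x')}{H^2}} = e^{-\frac{d^2(\mathbf{Y}_{x,\eta}, \mathbf{Y}_{x_0,\eta})}{H^2}} \Big/ \sum_{x' \in \mathbf{U}_{x_0,h}} e^{-\frac{d^2(\mathbf{Y}_{x',\eta}, \mathbf{Y}_{x_0,\eta})}{H^2}} \tag{18}$$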
Table 1 PSNR values when the "Oracle" estimator is applied with different values of M.

Image      Lena         Barbara      Boats        House        Peppers
Size       512 x 512    512 x 512    512 x 512    256 x 256    256 x 256

σ/PSNR     10/28.12db   10/28.12db   10/28.12db   10/28.11db   10/28.11db
9 x 9      38.98db      37.26db      37.66db      38.93db      37.85db
11 x 11    40.12db      38.49db      38.80db      40.04db      38.85db
13 x 13    41.09db      39.55db      39.78db      40.98db      39.64db
15 x 15    41.92db      40.45db      40.63db      41.77db      40.39db
17 x 17    42.64db      41.23db      41.39db      42.40db      41.00db
19 x 19    43.29db      41.93db      42.06db      43.06db      41.58db
21 x 21    43.88db      42.57db      42.67db      43.61db      42.14db

σ/PSNR     20/22.11db   20/22.11db   20/22.11db   20/28.12db   20/28.12db
9 x 9      33.61db      31.91db      32.32db      33.72db      32.62db
11 x 11    34.78db      33.20db      33.49db      34.92db      33.65db
13 x 13    35.80db      34.28db      34.49db      35.98db      34.51db
15 x 15    36.69db      35.22db      35.40db      36.80db      35.26db
17 x 17    37.48db      36.05db      36.20db      37.48db      35.89db
19 x 19    38.17db      36.74db      36.90db      38.07db      36.45db
21 x 21    38.80db      37.40db      37.54db      38.67db      36.98db

σ/PSNR     20/22.11db   20/22.11db   20/22.11db   20/28.12db   20/28.12db
9 x 9      30.65db      28.89db      29.25db      30.69db      29.51db
11 x 11    31.83db      30.23db      30.45db      31.90db      30.51db
13 x 13    32.85db      31.33db      31.49db      32.92db      31.34db
15 x 15    33.74db      32.27db      32.37db      33.76db      32.08db
17 x 17    34.50db      33.09db      33.16db      34.48db      32.74db
19 x 19    35.20db      33.81db      33.85db      35.13db      33.32db
21 x 21    35.79db      34.46db      34.48db      35.71db      33.85db


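and $\mathbf{U}_{x_0,h}$ is given by (4).

Note that $\Delta_{x_0,x}(y) = |f(y) - f(T_x y)|$ and $\zeta(y) = \varepsilon(T_x y) - \varepsilon(y)$. It is easy to see that

$$\hat\rho_{x_0}^2(x) = \frac{1}{m} \sum_{y \in \mathbf{U}_{x_0,\eta}} \big(f(T_x y) + \varepsilon(T_x y) - f(y) - \varepsilon(y)\big)^2 - 2\sigma^2$$

$$\le \frac{1}{m} \sum_{y \in \mathbf{U}_{x_0,\eta}} \big(\Delta_{x_0,x}(y) + \zeta(y)\big)^2 - 2\sigma^2$$

$$= \frac{1}{m} \sum_{y \in \mathbf{U}_{x_0,\eta}} \Delta_{x_0,x}^2(y) + \frac{1}{m} \sum_{y \in \mathbf{U}_{x_0,\eta}} \big(\zeta(y)^2 - 2\sigma^2 + 2\Delta_{x_0,x}(y)\zeta(y)\big)$$

$$= \frac{1}{m} \sum_{y \in \mathbf{U}_{x_0,\eta}} \Delta_{x_0,x}^2(y) + \frac{1}{m} S(x),$$

where

$$S(x) = \sum_{y \in \mathbf{U}_{x_0,\eta}} \big(\zeta(y)^2 - 2\sigma^2 + 2\Delta_{x_0,x}(y)\zeta(y)\big). \tag{19}$$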
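The correction by $2\sigma^2$ in the estimated similarity function rests on the identity $\mathbb{E}|Y(x) - Y(x_0)|^2 = |f(x) - f(x_0)|^2 + 2\sigma^2$ for independent noise. A small Monte Carlo sketch, assuming NumPy, illustrates it; the intensities `f_x`, `f_x0` and the number of trials are hypothetical values chosen for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 20.0
f_x, f_x0 = 120.0, 100.0                 # hypothetical true intensities at x and x0
trials = 200_000

# Independent noisy observations of the two pixels.
Y_x = f_x + rng.normal(0.0, sigma, trials)
Y_x0 = f_x0 + rng.normal(0.0, sigma, trials)

# Averaged squared difference minus 2 sigma^2, the analogue of the estimator (16).
rho2_hat = np.mean((Y_x - Y_x0) ** 2) - 2.0 * sigma**2
rho2_true = (f_x - f_x0) ** 2
```

With many independent samples `rho2_hat` concentrates around `rho2_true`; in the filter the average runs over the $m$ pixels of a patch instead, which is why larger patches give a more stable estimate when the noise is strong.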

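Theorem 2 Assume that $\eta = c_0 n^{-\alpha}$ $\left(\frac{(1-\beta)^+}{2\beta+2} < \alpha < \frac{1}{2}\right)$. Suppose that the function $f$ satisfies the local Hölder condition (14) and $\hat\rho_{x_0}$ is given by (16). Then there is a constant $c_1$ such that

$$\mathbb{P}\Big\{ \max_{x \in \mathbf{U}_{x_0,h}} \big|\hat\rho_{x_0}^2(x) - \rho_{f,x_0}^2(x)\big| \ge c_1 n^{\alpha - \frac{1}{2}} \sqrt{\ln n} \Big\} = O(n^{-1}). \tag{20}$$

For the proof of this theorem see Section 6.2.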



In Theorem 2, we consider that $\sigma$ is a constant, and the Hölder condition (14) implies that

$$\frac{1}{\mathrm{card}\, \mathbf{U}_{x_0,\eta}} \sum_{y \in \mathbf{U}_{x_0,\eta}} \Delta_{x_0,x}^2(y) - \rho_{f,x_0}^2(x) = O\big(n^{-\frac{2\beta}{2\beta+2}}\big).$$

Therefore, if $n$ is large enough, we have

$$\Big| \frac{1}{\mathrm{card}\, \mathbf{U}_{x_0,\eta}} \sum_{y \in \mathbf{U}_{x_0,\eta}} \Delta_{x_0,x}^2(y) - \rho_{f,x_0}^2(x) \Big| \ll \sigma.$$

That is to say, the larger the standard deviation of the noise is, the more useful our theorem will be. We take the test image "Lena" as an example, degraded by Gaussian noise with $\sigma = 10$, $\sigma = 20$ and $\sigma = 30$ respectively. We fix the size of the search window $M = 13 \times 13$, set $H = 0.4 \times \sigma + 2$ and choose similarity patches of size $(2k+1) \times (2k+1)$, $k = 1, 2, \dots, 20$. In the cases of $\sigma = 20$ and $\sigma = 30$, Figure 1 (b) and (c) illustrate that the PSNR value increases when the size of the similarity patch increases. This evolution of the PSNR value is in accordance with Theorem 2. However, in the case of $\sigma = 10$, Figure 1 (a) shows that the PSNR value increases while the side length of the similarity patch grows in the interval [3, 15], where it reaches its peak value, and then decreases in the interval [15, 41]. This means that the value $\sigma = 10$ is not large enough to satisfy the condition

$$\Big| \frac{1}{\mathrm{card}\, \mathbf{U}_{x_0,\eta}} \sum_{y \in \mathbf{U}_{x_0,\eta}} \Delta_{x_0,x}^2(y) - \rho_{f,x_0}^2(x) \Big| \ll \sigma.$$
Fig. 1 The evolution of PSNR value as a function of the size of a similarity patch.

In order to improve the results, we shall sometimes use the smoothed version $d_\kappa^2(\mathbf{Y}_{x,\eta}, \mathbf{Y}_{x_0,\eta})$ of the estimate of brightness variation instead of the non-smoothed one $d^2(\mathbf{Y}_{x,\eta}, \mathbf{Y}_{x_0,\eta})$. It should be noted that similar convergence results can be established for the smoothed versions of the estimated brightness variation. The smoothed estimator $d_\kappa^2(\mathbf{Y}_{x,\eta}, \mathbf{Y}_{x_0,\eta}) = \|\mathbf{Y}_{x,\eta} - \mathbf{Y}_{x_0,\eta}\|_{2,\kappa}^2$ is defined by

$$\|\mathbf{Y}_{x,\eta} - \mathbf{Y}_{x_0,\eta}\|_{2,\kappa}^2 = \frac{\sum_{y \in \mathbf{U}_{x_0,\eta}} \kappa(y)\big(Y(T_x y) - Y(y)\big)^2}{\sum_{y' \in \mathbf{U}_{x_0,\eta}} \kappa(y')}, \tag{21}$$



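where $\kappa(y)$ are some weights defined on $\mathbf{U}_{x_0,\eta}$. With the rectangular kernel

$$\kappa_r(y) = \begin{cases} 1, & y \in \mathbf{U}_{x_0,\eta}, \\ 0, & \text{otherwise}, \end{cases} \tag{22}$$

we obtain exactly the distance $d^2(\mathbf{Y}_{x,\eta}, \mathbf{Y}_{x_0,\eta})$. Other smoothing kernels $\kappa(y)$ used in the simulations are the Gaussian kernel

$$\kappa_g(y) = \exp\left( -\frac{N^2 \|y - x_0\|_2^2}{2 h_g} \right), \tag{23}$$

where $h_g$ is the bandwidth parameter, and the following kernel: for $y \in \mathbf{U}_{x_0,\eta}$,

$$\kappa_0(y) = \sum_{k=\max(1,j)}^{N\eta} \frac{1}{(2k+1)^2} \tag{24}$$

if $\|y - x_0\|_\infty = \frac{j}{N}$ for some $j \in \{0, 1, \dots, N\eta\}$. The kernel $\kappa(y) = \kappa_0(y)$ is used in our paper and in Buades et al. [1].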

To avoid the undesirable border effects, we mirror 
the image outside the image limits. In more detail, we 
extend the image outside the image limits symmetri- 
cally with respect to the border. At the corners, the 
image is extended symmetrically with respect to the 
corner pixels. 
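The mirroring described above can be sketched with NumPy's padding routine. This is an illustration under an assumption: `mode="reflect"` mirrors about the border pixels without repeating them, which is one reading of "symmetric with respect to the border"; `mode="symmetric"`, which repeats the border pixels, would be the other. The helper name `mirror_pad` and the toy image are hypothetical.

```python
import numpy as np

def mirror_pad(image, margin):
    """Extend the image by `margin` pixels on every side by mirror reflection.
    The margin must be smaller than the image side for mode="reflect"."""
    return np.pad(image, pad_width=margin, mode="reflect")

img = np.arange(16, dtype=float).reshape(4, 4)
padded = mirror_pad(img, margin=2)
```

In a Non-Local Means implementation the margin would typically be the sum of the search-window half-width and the patch half-width, so that every patch needed for a border pixel lies inside the padded array.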

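The following is the algorithm for denoising used in Buades et al. [1].

Algorithm NL-means (Buades et al. [1], http://dmi.uib.es/abuades/nlmeanscode.html).

Let $\{H, M, m\}$ be the parameters.
Repeat for each $x_0 \in \mathbf{I}$
- compute $d_\kappa^2(\mathbf{Y}_{x,\eta}, \mathbf{Y}_{x_0,\eta})$ (given by (21)) for $x \in \mathbf{U}_{x_0,h}$, $x \ne x_0$, and set $d_\kappa(\mathbf{Y}_{x_0,\eta}, \mathbf{Y}_{x_0,\eta}) = \max\{ d_\kappa(\mathbf{Y}_{x,\eta}, \mathbf{Y}_{x_0,\eta}) : x \ne x_0,\ x \in \mathbf{U}_{x_0,h} \}$
- compute $w(x) = \dfrac{\exp\big(-d_\kappa^2(\mathbf{Y}_{x,\eta}, \mathbf{Y}_{x_0,\eta})/H^2\big)}{\sum_{x' \in \mathbf{U}_{x_0,h}} \exp\big(-d_\kappa^2(\mathbf{Y}_{x',\eta}, \mathbf{Y}_{x_0,\eta})/H^2\big)}$ (see equation (27))
- compute $\hat f(x_0) = \sum_{x \in \mathbf{U}_{x_0,h}} w(x) Y(x)$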
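The steps of the algorithm can be sketched as follows. This is a simplified, unoptimized illustration assuming NumPy, not the reference code of Buades et al.: it uses the rectangular kernel (22) instead of $\kappa_0$, keeps the centre pixel with its natural weight instead of applying the special rule for $x = x_0$, and mirrors the image with `mode="reflect"`. The function name `nl_means` and its parameters (`search_half` $= Nh$, `patch_half` $= N\eta$) are hypothetical.

```python
import numpy as np

def nl_means(Y, search_half, patch_half, H):
    """Plain NL-means with a rectangular patch kernel (illustrative only)."""
    rows, cols = Y.shape
    pad = search_half + patch_half
    Yp = np.pad(Y, pad, mode="reflect")            # mirror the image at the border
    out = np.empty(Y.shape, dtype=float)
    for i in range(rows):
        for j in range(cols):
            ci, cj = i + pad, j + pad              # centre x0 in padded coordinates
            P0 = Yp[ci - patch_half:ci + patch_half + 1,
                    cj - patch_half:cj + patch_half + 1]
            num, den = 0.0, 0.0
            for di in range(-search_half, search_half + 1):
                for dj in range(-search_half, search_half + 1):
                    xi, xj = ci + di, cj + dj
                    P = Yp[xi - patch_half:xi + patch_half + 1,
                           xj - patch_half:xj + patch_half + 1]
                    d2 = np.mean((P - P0) ** 2)    # patch distance with 1/m normalization
                    w = np.exp(-d2 / H**2)         # unnormalized weight
                    num += w * Yp[xi, xj]
                    den += w
            out[i, j] = num / den                  # weighted mean over the search window
    return out
```

A call such as `nl_means(Y, search_half=6, patch_half=10, H=0.4 * sigma + 2)` would correspond to a 13 x 13 search window and a 21 x 21 patch; the four nested loops make this sketch slow on full-size images, so it is meant for small test arrays only.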



A detailed theoretical analysis and the convergence of the Non-Local Means Filter will be given in Section 3. In Section 4, the numerical simulations show that we can optimize the parameters to improve the Non-Local Means Filter.



3 Convergence theorem of Non-Local means 

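Now, we turn to the study of the convergence of the Non-Local Means Filter. Due to the difficulty in dealing with the dependence of the weights, we shall consider a slightly modified version of the algorithm: we divide the set of pixels into two independent parts, so that the weights are constructed from one part, and the estimate of the target function is a weighted mean over the other part. More precisely, assume that $x_0 \in \mathbf{I}$, $h > 0$ and $\eta > 0$. To prove the convergence we split the set of pixels into two parts $\mathbf{I} = \mathbf{I}'_{x_0} \cup \mathbf{I}''_{x_0}$, where

$$\mathbf{I}'_{x_0} = \left\{ x_0 + \left( \frac{i}{N}, \frac{j}{N} \right) \in \mathbf{I} : i + j \text{ is even} \right\}$$

and $\mathbf{I}''_{x_0} = \mathbf{I} \setminus \mathbf{I}'_{x_0}$. Define an estimated similarity function $\hat\rho'_{x_0}$ by

$$\hat\rho'^{\,2}_{x_0}(x) = \frac{1}{\mathrm{card}\, \mathbf{U}''_{x_0,\eta}} \sum_{y \in \mathbf{U}''_{x_0,\eta}} \big(Y(T_x y) - Y(y)\big)^2 - 2\sigma^2, \tag{25}$$

where $\mathbf{U}''_{x_0,\eta} = \mathbf{U}_{x_0,\eta} \cap \mathbf{I}''_{x_0}$ with $\mathbf{U}_{x_0,\eta}$ given by (4). Then define an adaptive estimator $\hat f'_h$ by

$$\hat f'_h(x_0) = \sum_{x \in \mathbf{U}'_{x_0,h}} \hat w'_h(x) Y(x), \tag{26}$$

where

$$\hat w'_h(x) = e^{-\frac{\hat\rho'^{\,2}_{x_0}(x)}{H^2}} \Big/ \sum_{x' \in \mathbf{U}'_{x_0,h}} e^{-\frac{\hat\rho'^{\,2}_{x_0}(x')}{H^2}} \tag{27}$$

and $\mathbf{U}'_{x_0,h} = \mathbf{U}_{x_0,h} \cap \mathbf{I}'_{x_0}$ with $\mathbf{U}_{x_0,h}$ given by (4).

In the next theorem we prove that the Mean Squared Error of the estimator $\hat f'_h(x_0)$ converges at the rate $n^{-\frac{2\beta}{2\beta+2}}$, which is the usual optimal rate of convergence for a given Hölder smoothness $\beta > 0$ (see e.g. Fan and Gijbels (1996 [20])).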
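The split of the pixel grid into two parts according to the parity of $i + j$ is a checkerboard pattern centred at $x_0$. A minimal sketch, assuming NumPy and integer pixel indices, is given below; the function name `parity_masks` and the 5 x 5 example are hypothetical.

```python
import numpy as np

def parity_masks(shape, center):
    """Boolean masks of the pixels with i + j even and with i + j odd,
    where (i, j) is the integer offset of a pixel from the centre x0."""
    rows, cols = np.indices(shape)
    offset_sum = (rows - center[0]) + (cols - center[1])
    even = (offset_sum % 2 == 0)
    return even, ~even

even_part, odd_part = parity_masks((5, 5), center=(2, 2))
```

The centre pixel itself always belongs to the even part, and its four direct neighbours belong to the odd part, so the pixels averaged in the estimate and the pixels used to build the weights never coincide.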



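Theorem 3 Let $\eta = c_0 n^{-\alpha}$, $h = \left(\frac{\sigma^2}{4\beta L^2}\right)^{\frac{1}{2\beta+2}} n^{-\frac{1}{2\beta+2}}$, $H > 2c_1 n^{\alpha - \frac{1}{2}}$ and $H > \sqrt{2} L h^\beta$. Suppose that the function $f$ satisfies the Hölder condition (14) and $\hat f'_h$ is given by (26). Then

$$\mathbb{E}\big(\hat f'_h(x_0) - f(x_0)\big)^2 \le 2 \left( \frac{2^{\frac{2\beta+6}{2\beta+2}} \sigma^{\frac{4\beta}{2\beta+2}} L^{\frac{4}{2\beta+2}}}{\beta^{\frac{2\beta}{2\beta+2}}} \right) n^{-\frac{2\beta}{2\beta+2}}.$$

For the proof of this theorem see Section 6.3.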



4 Simulation 

In this section, we compare the performance of the Non-Local Means Filter computed using the parameters proposed in this paper with those proposed in Buades et al. [1]. The results are measured by the usual Peak Signal-to-Noise Ratio (PSNR) in decibels (db), defined as

$$\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}}, \qquad \mathrm{MSE} = \frac{1}{\mathrm{card}\, \mathbf{I}} \sum_{x \in \mathbf{I}} \big(f(x) - \hat f_h(x)\big)^2,$$

where $f$ is the original image and $\hat f_h$ is the estimated one.
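The PSNR can be computed directly from its definition. The sketch below assumes NumPy and 8-bit images with peak value 255; the function name `psnr` is a hypothetical helper, and the function is undefined when the two images coincide (the MSE is then zero).

```python
import numpy as np

def psnr(f, f_hat, peak=255.0):
    """Peak Signal-to-Noise Ratio in decibels between an image and its estimate."""
    f = np.asarray(f, dtype=float)
    f_hat = np.asarray(f_hat, dtype=float)
    mse = np.mean((f - f_hat) ** 2)       # mean squared error over all pixels
    return 10.0 * np.log10(peak**2 / mse)
```

As a consistency check, an error with mean square $10^2$ gives about 28.13 db, in line with the values close to 28.12 db reported in the tables for noisy images with $\sigma = 10$.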

We have done simulations on a commonly-used set of images available at http://decsai.ugr.es/~javier/denoise/test_images/. The potential of the estimation method is illustrated with the 512 x 512 "Lena" image (Figure 2(a)) corrupted by an additive white Gaussian noise (Figure 2(a) right, PSNR = 22.10db, $\sigma = 20$). We have seen experimentally that the filtering parameter $H$ can take values between $0.4 \times \sigma + 2$ and $0.5 \times \sigma + 2$, obtaining a high visual quality solution. Theorem 3 implies that the search window is of size $c_0 \sigma^{\frac{2}{2\beta+2}}$. Assuming that $\beta = 1$, we get a search window of size $c_0 \sqrt{\sigma}$. Experiments show that when the size of the search window takes the value $1.5 \times \sqrt{\sigma} + 4.5$, we obtain the best quality for the Non-Local Means Filter. Our simulations also show that it is convenient to take the similarity patch size as $m = 17 \times 17$ for $\sigma = 10$, and $m = 21 \times 21$ for $\sigma = 20$ and $\sigma = 30$. In Figure 2(b) left, we can see that the noise is reduced in a natural manner and significant geometric features, fine textures, and original contrasts are visually well recovered with no undesirable artifacts (PSNR = 32.39db). To better appreciate the accuracy of the restoration process, the square of the difference between the original image and the recovered image is shown in Figure 2(b) right, where dark values correspond to high-confidence estimates. As expected, pixels with a low level of confidence are located in the neighborhood of image discontinuities. For comparison, we give the image denoised by the Non-Local Means Filter with 21 x 21 search windows and 9 x 9 similarity patches (PSNR = 31.51db) and its square error in Figure 2(c). The overall visual impression and the numerical results are improved using our theory.

In Table 2, we show a comparison of the PSNR values of the Non-Local Means Filter computed with the parameters proposed in Buades et al. [1] and with those proposed in our paper. It is easy to see that the gain becomes more noticeable as the standard deviation $\sigma$ increases. There is no improvement on average for $\sigma = 10$, but the PSNR improves by 0.50db on average for $\sigma = 20$ and by 0.98db on average for $\sigma = 30$. The comparison with several filters is given in Table 3. The PSNR values show that the Non-Local Means Filter with proper parameters is as good as more sophisticated methods, like [3, 26, 28, 29], and is better than the filters proposed in [22-26]. The proposed approach gives a denoising quality which is competitive with that of the recent method BM3D [16].

5 Conclusion 

We have established new convergence theorems for the Non-Local Means Filter, based on the optimization of parameters in the weighted means approach. Our analysis shows that a small search window is preferable to the whole image, and that a large similarity patch ($m = 21 \times 21$) is preferable to a small similarity patch ($m = 7 \times 7$). The parameters suggested by the theorems improve on the usual parameters of the Non-Local Means Filter both numerically and visually in denoising performance. We hope that the convergence theorems for the Non-Local Means Filter that we deduced can also bring similar improvements for recently developed algorithms where the basic idea of the Non-Local Means Filter is used.

Table 2 Comparison between the Non-Local Means Filter with Buades' parameters and our parameters.

Image          Lena         Barbara      Boats        House        Peppers
Size           512 x 512    512 x 512    512 x 512    256 x 256    256 x 256

σ/PSNR         10/28.12db   10/28.12db   10/28.12db   10/28.11db   10/28.11db
PSNR/Buades    34.99db      33.82db      32.85db      35.50db      33.13db
PSNR/Ours      35.22db      33.55db      33.00db      35.35db      33.16db
ΔPSNR          0.23db       -0.27db      0.15db       -0.15db      0.03db

σ/PSNR         20/22.11db   20/22.11db   20/22.11db   20/28.12db   20/28.12db
PSNR/Buades    31.51db      30.38db      29.32db      32.51db      29.73db
PSNR/Ours      32.39db      30.62db      30.02db      32.57db      30.30db
ΔPSNR          0.82db       0.24db       0.70db       0.08db       0.57db

σ/PSNR         30/18.60db   30/18.60db   30/18.60db   30/18.61db   30/18.61db
PSNR/Buades    28.86db      27.65db      27.38db      29.17db      27.67db
PSNR/Ours      30.20db      28.06db      28.60db      30.49db      28.28db
ΔPSNR          1.34db       0.41db       1.22db       1.32db       0.61db

Table 3 Performance of denoising algorithms when applied to test noisy (WGN) images, σ = 20.

Images                      Lena         Barbara      Boat         House        Peppers
Sizes                       512 x 512    512 x 512    512 x 512    256 x 256    256 x 256
Method                      PSNR         PSNR         PSNR         PSNR         PSNR

Non-Local Means
(M = 13 x 13, m = 21 x 21)  32.39db      30.62db      30.02db      32.57db      30.30db
Buades et al [21]           31.51db      30.38db      29.32db      32.51db      29.73db
Salmon et al [22]           -            -            -            -            29.46db
Katkovnik et al [23]        30.74db      27.38db      29.03db      31.24db      29.58db
Foi et al [24]              31.43db      27.90db      39.61db      31.84db      30.30db
Roth et al [25]             31.89db      28.28db      29.86db      32.29db      30.47db
Hirakawa et al [26]         32.69db      31.06db      30.25db      32.58db      30.21db
Kervrann et al [3]          32.64db      30.37db      30.12db      32.90db      30.59db
Jin et al [27]              32.68db      31.04db      30.30db      32.83db      30.61db
Hammond et al [28]          32.81db      30.76db      30.41db      32.52db      30.40db
Aharon et al [29]           32.39db      30.84db      30.39db      33.10db      30.80db
Dabov et al [16]            33.05db      31.78db      30.88db      33.77db      31.29db



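6 Proofs of the main results

6.1 Proof of Theorem 1

Denoting for brevity

$$I_1 = \left( \frac{\sum_{\|x - x_0\|_\infty \le h} e^{-\frac{\rho_{f,x_0}^2(x)}{H^2}} \rho_{f,x_0}(x)}{\sum_{\|x - x_0\|_\infty \le h} e^{-\frac{\rho_{f,x_0}^2(x)}{H^2}}} \right)^2 \tag{28}$$

and

$$I_2 = \sigma^2 \frac{\sum_{\|x - x_0\|_\infty \le h} e^{-\frac{2\rho_{f,x_0}^2(x)}{H^2}}}{\Big( \sum_{\|x - x_0\|_\infty \le h} e^{-\frac{\rho_{f,x_0}^2(x)}{H^2}} \Big)^2}, \tag{30}$$

then we have

$$g(w_h^*) = I_1 + I_2. \tag{31}$$

Noting that $t e^{-\frac{t^2}{H^2}}$, $t \in [0, H/\sqrt{2})$, is increasing, it is easy to see that

$$\sum_{\|x - x_0\|_\infty \le h} e^{-\frac{\rho_{f,x_0}^2(x)}{H^2}} \rho_{f,x_0}(x) \le \sum_{\|x - x_0\|_\infty \le h} L\|x - x_0\|_\infty^\beta\, e^{-\frac{L^2 \|x - x_0\|_\infty^{2\beta}}{H^2}} \le \sum_{\|x - x_0\|_\infty \le h} L\|x - x_0\|_\infty^\beta \le 4 L h^{\beta+2} n. \tag{32}$$

Since $e^{-\frac{t^2}{H^2}}$, $t \in [0, H/\sqrt{2})$, is decreasing, using a one-term Taylor expansion,

$$\sum_{\|x - x_0\|_\infty \le h} e^{-\frac{\rho_{f,x_0}^2(x)}{H^2}} \ge \sum_{\|x - x_0\|_\infty \le h} e^{-\frac{L^2 \|x - x_0\|_\infty^{2\beta}}{H^2}} \ge \sum_{\|x - x_0\|_\infty \le h} \left( 1 - \frac{L^2 \|x - x_0\|_\infty^{2\beta}}{H^2} \right) \ge 2 h^2 n. \tag{33}$$

The above three inequalities (28), (33) and (32) imply that

$$I_1 \le 4 L^2 h^{2\beta}. \tag{34}$$

Taking into account the inequality

$$\sum_{\|x - x_0\|_\infty \le h} e^{-\frac{2\rho_{f,x_0}^2(x)}{H^2}} \le \sum_{\|x - x_0\|_\infty \le h} 1 = 4 h^2 n,$$

(30) and (33), it is easily seen that

$$I_2 \le \frac{\sigma^2}{h^2 n}. \tag{35}$$

Combining (31), (34), and (35), we obtain

$$g(w_h^*) \le 4 L^2 h^{2\beta} + \frac{\sigma^2}{h^2 n}. \tag{36}$$

Let $h$ minimize the latter term of the above inequality. Then

$$8 \beta L^2 h^{2\beta - 1} - \frac{2\sigma^2}{h^3 n} = 0,$$

from which we infer that

$$h = \left( \frac{\sigma^2}{4 \beta L^2} \right)^{\frac{1}{2\beta+2}} n^{-\frac{1}{2\beta+2}}. \tag{37}$$

Substituting (37) into (36) leads to

$$g(w_h^*) \le \frac{2^{\frac{2\beta+6}{2\beta+2}} \sigma^{\frac{4\beta}{2\beta+2}} L^{\frac{4}{2\beta+2}}}{\beta^{\frac{2\beta}{2\beta+2}}}\, n^{-\frac{2\beta}{2\beta+2}}.$$

Therefore (12) implies (15).

6.2 Proof of Theorem 2

First, we prove the following lemma:

Lemma 1 Suppose that $S(x)$ is given by (19). Then there are two constants $c_2$ and $c_3$ such that for any $0 \le z \le c_2 m^{1/2}$,

$$\mathbb{P}\big(|S(x)| \ge z\sqrt{m}\big) \le 2 \exp(-c_3 z^2).$$

Proof Let $m$ be given by (6). Denote $\xi(y) = \zeta(y)^2 - 2\sigma^2 + 2\Delta_{x_0,x}(y)\zeta(y)$. Since $\zeta(y)$ is a normal random variable with mean $0$ and variance $2\sigma^2$, there exist two positive constants $t_0$ and $c_4$ depending only on $\beta$, $L$ and $\sigma^2$ such that $\varphi_y(t) = \mathbb{E} e^{t\xi(y)} \le c_4$ for any $|t| \le t_0$. Let $\psi_y(t) = \ln \varphi_y(t)$ be the cumulant generating function. By Chebyshev's exponential inequality we get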



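$$\mathbb{P}\{S(x) > z\sqrt{m}\} \le \exp\Big( -t\sqrt{m}\, z + \sum_{y \in \mathbf{U}_{x_0,\eta}} \psi_y(t) \Big), \tag{38}$$

for any $|t| \le t_0$ and for any $z > 0$. By a three-term Taylor expansion, for $|t| \le t_0$,

$$\psi_y(t) = \psi_y(0) + t\psi'_y(0) + \frac{t^2}{2}\psi''_y(\theta t),$$

where $|\theta| \le 1$, $\psi_y(0) = 0$, $\psi'_y(0) = \mathbb{E}\xi(y) = 0$ and

$$0 \le \psi''_y(t) = \frac{\varphi''_y(t)\varphi_y(t) - (\varphi'_y(t))^2}{(\varphi_y(t))^2} \le \frac{\varphi''_y(t)}{\varphi_y(t)}. \tag{39}$$

Since, by Jensen's inequality, $\mathbb{E} e^{t\xi(y)} \ge e^{t\mathbb{E}\xi(y)} = 1$, we arrive at the following upper bound:

$$\psi''_y(t) \le \varphi''_y(t) = \mathbb{E}\, \xi(y)^2 e^{t\xi(y)}.$$

Using the elementary inequality $x^2 e^x \le e^{3x}$, $x \ge 0$, we have, for $|t| \le t_0/3$,

$$\psi''_y(t) \le \frac{9}{t_0^2}\, c_4. \tag{40}$$

The inequality (40) combined with (39) implies that for $|t| \le t_0/3$,

$$0 \le \psi_y(t) \le \frac{9 c_4}{2 t_0^2}\, t^2.$$

Then (38) becomes

$$\mathbb{P}\big(S(x) > z\sqrt{m}\big) \le \exp\Big( -t z \sqrt{m} + \frac{9 c_4}{2 t_0^2}\, m t^2 \Big).$$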



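If $t = c' z/\sqrt{m} \le t_0/3$, we obtain

$$\mathbb{P}\big(S(x) > z\sqrt{m}\big) \le \exp\Big( -c' z^2 \Big( 1 - \frac{9 c_4}{2 t_0^2}\, c' \Big) \Big).$$

Choosing $c'$ sufficiently small we arrive at

$$\mathbb{P}\big(S(x) > z\sqrt{m}\big) \le \exp(-c_3 z^2),$$

for some constant $c_3 > 0$. In the same way we show that

$$\mathbb{P}\big(S(x) < -z\sqrt{m}\big) \le \exp(-c_3 z^2).$$

This proves the lemma.

Finally, we turn to the proof of Theorem 2. Applying Lemma 1 with $z = \sqrt{\frac{1}{c_3} \ln n^2}$, we see that

$$\mathbb{P}\left( \frac{1}{m}|S(x)| \ge \frac{\sqrt{\ln n^2}}{\sqrt{c_3}\sqrt{m}} \right) \le 2\exp(-\ln n^2) = \frac{2}{n^2}.$$

From this inequality we easily deduce that

$$\mathbb{P}\left( \max_{x \in \mathbf{U}_{x_0,h}} \frac{1}{m}|S(x)| \ge \frac{\sqrt{\ln n^2}}{\sqrt{c_3}\sqrt{m}} \right) \le \sum_{x \in \mathbf{U}_{x_0,h}} \mathbb{P}\left( \frac{1}{m}|S(x)| \ge \frac{\sqrt{\ln n^2}}{\sqrt{c_3}\sqrt{m}} \right) \le \frac{2}{n}.$$

Taking $m = (2N\eta + 1)^2 = c_2 n^{1-2\alpha}$, we arrive at

$$\mathbb{P}\left( \max_{x \in \mathbf{U}_{x_0,h}} \frac{1}{m}|S(x)| \ge c_1 n^{\alpha - \frac{1}{2}}\sqrt{\ln n} \right) = O(n^{-1}). \tag{41}$$

The local Hölder condition (14) implies that

$$\big|\hat\rho_{x_0}^2(x) - \rho_{f,x_0}^2(x)\big| \le O\big(n^{-\frac{2\beta}{2\beta+2}}\big) + \frac{1}{m}|S(x)|. \tag{42}$$

Combining (41) and (42), we get

$$\mathbb{P}\left( \max_{x \in \mathbf{U}_{x_0,h}} \big|\hat\rho_{x_0}^2(x) - \rho_{f,x_0}^2(x)\big| \ge O\big(n^{-\frac{2\beta}{2\beta+2}}\big) + c_1 n^{\alpha - \frac{1}{2}}\sqrt{\ln n} \right) = O(n^{-1}). \tag{43}$$

Because the condition $\frac{(1-\beta)^+}{2\beta+2} < \alpha < \frac{1}{2}$ implies that

$$n^{-\frac{2\beta}{2\beta+2}} \le n^{\alpha - \frac{1}{2}}, \tag{44}$$

the inequality (20) holds.

6.3 Proof of Theorem 3

Taking into account (25), (26), and the independence of $\varepsilon(x)$, we have

$$\mathbb{E}\big( |\hat f'_h(x_0) - f(x_0)|^2 \,\big|\, Y(x), x \in \mathbf{I}''_{x_0} \big) \le g'(\hat w'_h), \tag{45}$$

where

$$g'(w) = \Big( \sum_{x \in \mathbf{U}'_{x_0,h}} w(x) \rho_{f,x_0}(x) \Big)^2 + \sigma^2 \sum_{x \in \mathbf{U}'_{x_0,h}} w(x)^2.$$

From the proof of Theorem 1, we infer that

$$g'(w_h^*) \le \left( \frac{2^{\frac{2\beta+6}{2\beta+2}} \sigma^{\frac{4\beta}{2\beta+2}} L^{\frac{4}{2\beta+2}}}{\beta^{\frac{2\beta}{2\beta+2}}} \right) n^{-\frac{2\beta}{2\beta+2}}. \tag{46}$$

By Theorem 2 and its proof, for $\hat\rho'_{x_0}$ defined by (25), there is a constant $c_1$ such that

$$\mathbb{P}\left\{ \max_{x \in \mathbf{U}'_{x_0,h}} \big|\hat\rho'^{\,2}_{x_0}(x) - \rho_{f,x_0}^2(x)\big| \ge c_1 n^{\alpha - \frac{1}{2}}\sqrt{\ln n} \right\} = O(n^{-1}). \tag{47}$$

Let $B = \Big\{ \max_{x \in \mathbf{U}'_{x_0,h}} \big|\hat\rho'^{\,2}_{x_0}(x) - \rho_{f,x_0}^2(x)\big| \le c_1 n^{\alpha - \frac{1}{2}}\sqrt{\ln n} \Big\}$. On the set $B$, we have $\rho_{f,x_0}^2(x) - c_1 n^{\alpha - \frac{1}{2}}\sqrt{\ln n} \le \hat\rho'^{\,2}_{x_0}(x) \le \rho_{f,x_0}^2(x) + c_1 n^{\alpha - \frac{1}{2}}\sqrt{\ln n}$, from which we infer that

$$\hat w'_h(x) = \frac{e^{-\frac{\hat\rho'^{\,2}_{x_0}(x)}{H^2}}}{\sum_{x' \in \mathbf{U}'_{x_0,h}} e^{-\frac{\hat\rho'^{\,2}_{x_0}(x')}{H^2}}} \le \frac{e^{-\frac{\rho_{f,x_0}^2(x) - c_1 n^{\alpha - 1/2}\sqrt{\ln n}}{H^2}}}{\sum_{x' \in \mathbf{U}'_{x_0,h}} e^{-\frac{\rho_{f,x_0}^2(x') + c_1 n^{\alpha - 1/2}\sqrt{\ln n}}{H^2}}} \le \frac{1 + \frac{2 c_1 n^{\alpha - 1/2}\sqrt{\ln n}}{H^2}}{1 - \frac{c_1 n^{\alpha - 1/2}\sqrt{\ln n}}{H^2}}\; w_h^*(x).$$

This implies that

$$g'(\hat w'_h) \le \left( \frac{1 + \frac{2 c_1 n^{\alpha - 1/2}\sqrt{\ln n}}{H^2}}{1 - \frac{c_1 n^{\alpha - 1/2}\sqrt{\ln n}}{H^2}} \right)^2 g'(w_h^*).$$

Consequently, the inequality (45) becomes

$$\mathbb{E}\big( |\hat f'_h(x_0) - f(x_0)|^2 \,\big|\, Y(x), x \in \mathbf{I}''_{x_0}, B \big) \le \left( \frac{1 + \frac{2 c_1 n^{\alpha - 1/2}\sqrt{\ln n}}{H^2}}{1 - \frac{c_1 n^{\alpha - 1/2}\sqrt{\ln n}}{H^2}} \right)^2 g'(w_h^*). \tag{48}$$

Since the function $f$ satisfies the local Hölder condition (14),

$$\mathbb{E}\big( |\hat f'_h(x_0) - f(x_0)|^2 \,\big|\, Y(x), x \in \mathbf{I}''_{x_0} \big) \le g'(\hat w'_h) \le c_2, \tag{49}$$

for a constant $c_2 > 0$ depending only on $\beta$, $L$, and $\sigma$. Combining (20), (48), and (49), we have

$$\mathbb{E}\big( |\hat f'_h(x_0) - f(x_0)|^2 \,\big|\, Y(x), x \in \mathbf{I}''_{x_0} \big) = \mathbb{E}\big( |\hat f'_h(x_0) - f(x_0)|^2 \,\big|\, Y(x), x \in \mathbf{I}''_{x_0}, B \big) \mathbb{P}(B) + \mathbb{E}\big( |\hat f'_h(x_0) - f(x_0)|^2 \,\big|\, Y(x), x \in \mathbf{I}''_{x_0}, \bar B \big) \mathbb{P}(\bar B)$$

$$\le \left( \frac{1 + \frac{2 c_1 n^{\alpha - 1/2}\sqrt{\ln n}}{H^2}}{1 - \frac{c_1 n^{\alpha - 1/2}\sqrt{\ln n}}{H^2}} \right)^2 g'(w_h^*) + O(n^{-1}).$$

Now, the assertion of the theorem is obtained easily if we note the inequality (46).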